Support for smp in multiple DMs #910

Closed · wants to merge 2 commits

Conversation

@lz-bro (Contributor) commented Aug 28, 2023

Support SMP in multiple DMs by simply handling the RISC-V hartid and the per-DM index of each hart
Change-Id: 7c1b89f89d0f5ccb6031e754fc4bbec2d70d53c4

There is an issue about support for SMP in multiple DMs: #907

When I debug 2 DMs, each with 2 harts, my config file is like:

set _TARGETNAME_0 $_CHIPNAME.dm0cpu0
set _TARGETNAME_1 $_CHIPNAME.dm0cpu1
set _TARGETNAME_2 $_CHIPNAME.dm1cpu0
set _TARGETNAME_3 $_CHIPNAME.dm1cpu1

# cluster 0
target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -rtos hwthread
target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1

# cluster 1
target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 4 -dbgbase 0x400 -rtos hwthread
target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 5 -dbgbase 0x400

target smp $_TARGETNAME_0 $_TARGETNAME_1 $_TARGETNAME_2 $_TARGETNAME_3

This config file works with the code:
info->index = target->coreid % dm->hart_count;
dm->hart_count was 4, so when I write 4/5 to coreid, info->index becomes 0/1. For each Debug Module, the index of each hart can then start at 0 and be contiguous.
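The modulo mapping described above can be sketched as a tiny helper. This is a hypothetical illustration, not the actual OpenOCD code, and it assumes every DM exposes the same `hart_count`:

```c
#include <assert.h>

/* Hypothetical sketch: map a global coreid to the hart's index within
 * its Debug Module, the way the patch's modulo expression does.
 * Assumes every DM exposes hart_count harts. */
static inline int dm_hart_index(int coreid, int hart_count)
{
    return coreid % hart_count;
}
```

With dm->hart_count = 4 as above, coreids 4 and 5 map to indices 0 and 1.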
The gdb session logged with OpenOCD shows:

(gdb) info threads
  Id   Target Id                                                            Frame
* 1    Thread 1 "riscv.dm0cpu0" (Name: riscv.dm0cpu0, state: debug-request) 0x0000000000001c6c in ?? ()
  2    Thread 2 "riscv.dm0cpu1" (Name: riscv.dm0cpu1, state: debug-request) 0x000000000000769a in ?? ()
  3    Thread 5 "riscv.dm1cpu0" (Name: riscv.dm1cpu0, state: debug-request) 0x000000000000769a in ?? ()
  4    Thread 6 "riscv.dm1cpu1" (Name: riscv.dm1cpu1, state: debug-request) 0x0000000000007698 in ?? ()
(gdb) c
Continuing.

@JanMatCodasip (Collaborator) commented Aug 29, 2023

I'm sorry, but my understanding is that the SMP support in RISC-V is not designed to work across multiple debug modules. I am afraid that this simple change won't make it work. (@timsifive please correct me if I am mistaken.)

In any case, if harts are managed by multiple different DMs, it is not even possible to accomplish the near-instantaneous halt & resume -- which is one of the main features of SMP debugging. So this is not just an OpenOCD limitation, it is also a limitation of the given HW.

@lz-bro lz-bro closed this Aug 29, 2023
@lz-bro (Contributor, Author) commented Aug 29, 2023

I'm sorry, but my understanding is that the SMP support in RISC-V is not designed to work across multiple debug modules. I am afraid that this simple change won't make it work. (@timsifive please correct me if I am mistaken.)

In any case, if harts are managed by multiple different DMs, it is not even possible to accomplish the near-instantaneous halt & resume -- which is one of the main features of SMP debugging. So this is not just an OpenOCD limitation, it is also a limitation of the given HW.

Thank you for your reply. This is not a perfect solution, because it can only accomplish instantaneous halt & resume of the harts within each cluster (through hartwindow). But even if the external triggers in the Debug 1.0 spec let hardware halt/resume harts simultaneously, this problem will still exist. So my idea is either to make OpenOCD report an error when the user configures cross-cluster SMP (in another patch), or to support a similarly imperfect cross-cluster SMP implementation (in this patch).

@lz-bro lz-bro reopened this Aug 29, 2023
@timsifive (Collaborator) left a review comment

I think you can pass -coreid to target create commands, which means you don't need this change. Does that not work?

@@ -2027,6 +2026,10 @@ static int examine(struct target *target)
 		return ERROR_FAIL;
 	}
 
+	/* The RISC-V hartid is sequential, and the index of each hart
+	 * on the Debug Module should start at 0 and be contiguous. */
+	info->index = target->coreid % dm->hart_count;
@timsifive (Collaborator) commented

Doesn't this assume that each DM has the same number of harts?

@lz-bro (Contributor, Author) commented Aug 31, 2023

Thanks for your reminder. I didn't consider the situation where each DM has a different number of harts; if that happens, the user configuration becomes unfriendly. So I currently have two ideas to solve it:

1. Config modification -- config like:

# cluster 0
target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -index 0 -rtos hwthread
target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1 -index 1

# cluster 1
target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 2 -index 0 -dbgbase 0x400 -rtos hwthread
target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 3 -index 1 -dbgbase 0x400

target smp $_TARGETNAME_0 $_TARGETNAME_1 $_TARGETNAME_2 $_TARGETNAME_3

Modify the jim_target_create function to obtain target->index.

The assignment of info->index at
https://github.com/riscv/riscv-openocd/blob/699eecaab434337dc3915171606b0548c48c6d51/src/target/riscv/riscv-013.c#L1900
would change to:

info->index = target->index;

2. Get different thread ids through hartid -- config like:

# cluster 0
target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -rtos hwthread
target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1

# cluster 1
target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -dbgbase 0x400 -rtos hwthread
target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 1 -dbgbase 0x400

target smp $_TARGETNAME_0 $_TARGETNAME_1 $_TARGETNAME_2 $_TARGETNAME_3

Make appropriate modifications to the threadid_from_target function; I don't know yet how to modify it:

static inline threadid_t threadid_from_target(const struct target *target)
{
	/* return target->coreid + 1; */
	...
}

The thread ids should be 0/1/2/3.
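One conceivable shape for such a modification is to fold a per-DM offset into the thread id. This is only a sketch of the idea: the `dm_index` and `harts_per_dm` parameters are hypothetical and do not exist as target fields in upstream OpenOCD, and it assumes all DMs hold the same number of harts:

```c
#include <assert.h>

/* Hypothetical sketch: derive a unique, contiguous GDB thread id when
 * coreid restarts at 0 in every cluster. dm_index and harts_per_dm are
 * assumed inputs, not existing OpenOCD target fields. */
static inline int threadid_from_dm(int dm_index, int coreid, int harts_per_dm)
{
    /* GDB thread ids start at 1, matching threadid_from_target(). */
    return dm_index * harts_per_dm + coreid + 1;
}
```

With two DMs of two harts each, this yields the contiguous thread ids 1 through 4.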

@lz-bro (Contributor, Author) commented Aug 31, 2023

I think you can pass -coreid to target create commands, which means you don't need this change. Does that not work?

This patch is about support for SMP in multiple DMs (#907); cross-cluster SMP cannot be realized by passing -coreid to target create commands.

@timsifive (Collaborator) commented

This patch is about support for SMP in multiple DMs (#907); cross-cluster SMP cannot be realized by passing -coreid to target create commands.

That's not obvious to me. What happens when you pass coreid like this:

# cluster 0
target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -rtos hwthread
target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1

# cluster 1
target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 2 -dbgbase 0x400 -rtos hwthread
target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 3 -dbgbase 0x400

@lz-bro (Contributor, Author) commented Sep 1, 2023

This patch is about support for SMP in multiple DMs (#907); cross-cluster SMP cannot be realized by passing -coreid to target create commands.

That's not obvious to me. What happens when you pass coreid like this:

# cluster 0
target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -rtos hwthread
target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1

# cluster 1
target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 2 -dbgbase 0x400 -rtos hwthread
target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 3 -dbgbase 0x400

info->index stores the index of each hart within its DM, and for each DM these indices must start at 0 and be contiguous. That means that if, to select the first hart on cluster 1 (DM 1), I write 2 to dmcontrol.hartsel, it won't work: the first hart in a DM must be selected with dmcontrol.hartsel=0, the second with dmcontrol.hartsel=1, and so on.
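To illustrate why hartsel must be a per-DM index, here is a sketch (an illustration, not OpenOCD code) that packs an index into dmcontrol's hartsello/hartselhi fields as laid out in the RISC-V Debug Specification 0.13 (bits 25:16 and 15:6 respectively):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: pack a per-DM hart index into the dmcontrol hartsel fields.
 * RISC-V Debug Spec 0.13: hartsello = bits 25:16, hartselhi = bits 15:6.
 * hartsel counts harts within one DM, starting at 0, so a global
 * coreid like 2 would select the wrong hart on a 2-hart DM. */
static inline uint32_t dmcontrol_set_hartsel(uint32_t dmcontrol, uint32_t index)
{
    const uint32_t lo = index & 0x3ff;         /* low 10 bits of hartsel */
    const uint32_t hi = (index >> 10) & 0x3ff; /* high 10 bits of hartsel */
    dmcontrol &= ~((0x3ffu << 16) | (0x3ffu << 6)); /* clear both fields */
    return dmcontrol | (lo << 16) | (hi << 6);
}
```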

@timsifive (Collaborator) commented

OK, so what happens if you do:

# cluster 0
target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -rtos hwthread
target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1

# cluster 1
target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -dbgbase 0x400 -rtos hwthread
target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 1 -dbgbase 0x400

@lz-bro (Contributor, Author) commented Sep 6, 2023

OK, so what happens if you do:

# cluster 0
target create $_TARGETNAME_0 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -rtos hwthread
target create $_TARGETNAME_1 riscv -chain-position $_CHIPNAME.cpu -coreid 1

# cluster 1
target create $_TARGETNAME_2 riscv -chain-position $_CHIPNAME.cpu -coreid 0 -dbgbase 0x400 -rtos hwthread
target create $_TARGETNAME_3 riscv -chain-position $_CHIPNAME.cpu -coreid 1 -dbgbase 0x400
target smp $_TARGETNAME_0 $_TARGETNAME_1
target smp $_TARGETNAME_2 $_TARGETNAME_3

If I only group the harts within each DM via target smp and use two gdb instances to debug, there is no problem.

target smp $_TARGETNAME_0 $_TARGETNAME_1 $_TARGETNAME_2 $_TARGETNAME_3

But if I group them all together, one gdb can only see the 2 cores in dm1:

(gdb) info threads
  Id   Target Id                                                            Frame
* 1    Thread 1 "riscv.dm1cpu0" (Name: riscv.dm1cpu0, state: debug-request) 0x0000000000001c6c in ?? ()
  2    Thread 2 "riscv.dm1cpu1" (Name: riscv.dm1cpu1, state: debug-request) 0x000000000000769a in ?? ()

So the harts in multiple DMs cannot be put into one SMP group to halt/resume together, and I would like some advice about this.

@timsifive (Collaborator) commented

Hmm... the comment for target.core_id says "which device on the TAP?". That implies that you could have multiple targets with the same coreid if they are on different TAPs. struct gdb_service in target.h then implies that gdb requires a unique coreid for each target. That seems like a bug, because there can be multiple TAPs -- although perhaps I'm missing some code; I only took a quick look.

Is that fixable? That seems like a better fix than RISC-V-specific heuristics.

@lz-bro (Contributor, Author) commented Sep 12, 2023

Hmm... the comment for target.core_id says "which device on the TAP?". That implies that you could have multiple targets with the same coreid if they are on different TAPs. struct gdb_service in target.h then implies that gdb requires a unique coreid for each target. That seems like a bug, because there can be multiple TAPs -- although perhaps I'm missing some code; I only took a quick look.

Is that fixable? That seems like a better fix than RISC-V-specific heuristics.

As I understand it, target->coreid is used both to specify which device is on the TAP and to define a unique threadid (gdb) for each target. If I put all targets across multiple clusters into one SMP group, there is a bug.
I gave some ideas in previous comments.

@lz-bro (Contributor, Author) commented Sep 21, 2023

Thanks, we will try to fix it in upstream OpenOCD.

@lz-bro lz-bro closed this Sep 21, 2023